Patent abstract:
An artificial cochlea ambient sound sensing method and system. The method comprises the following steps: a sound acquisition module acquires ambient sound in real time by using a microphone, and outputs an acquired discrete sound signal to a sound feature extraction module (S10); the sound feature extraction module processes sound signals transmitted by the sound acquisition module, extracts one group of feature values representing sound signal characteristics, and outputs the feature values to a neural network classification module (S20); the neural network classification module performs classification on the group of feature values by means of a trained neural network after one group of feature values extracted by the sound feature extraction module are received, and then outputs the classification result to an integrated decision module (S30); the integrated decision module integrally analyzes and determines the current scene after the classification result of the neural network classification module is received, and outputs the determination result to a voice processing selection module (S40); the voice processing selection module selects the optimal voice processing program and parameter configuration thereof according to the determination result of the integrated decision module for the current scene (S50).
Publication number: ES2849124A1
Application number: ES202190003
Filing date: 2019-07-19
Publication date: 2021-08-13
Inventors: Xiaowei Zhang; Yan Han; Xiaoan Sun; Sui Huang
Applicant: ZHEJIANG NUROTRON BIOTECHNOLOGY CO Ltd;
IPC main classification:
Patent description:

[0001] AMBIENT SOUND DETECTION SYSTEM AND METHOD FOR AN IMPLANT
[0004] Technical field
[0006] The present invention belongs to the field of signal processing and relates to an ambient sound detection method and system for a cochlear implant.
[0009] Background of the invention
[0011] A cochlear implant is at present the only medical device on the market that can effectively restore hearing in patients with severe or profound deafness. The common cochlear implant operates as follows: a sound signal acquired by a microphone is converted into a stimulation code by a signal processing unit and sent to an implant; the implant then stimulates the auditory nerve through microelectrodes according to the stimulation code, thereby restoring the hearing of the patient carrying the implant. Like other assistive listening devices such as hearing aids, a common cochlear implant system cannot perform an important function of the normal human auditory system, namely distinguishing and extracting a target signal in a complex auditory scene; for example, it does not allow the patient to hear clearly what a conversation partner is saying in a group of people or in a relatively noisy environment. A general solution is to reduce the influence of noise on listening by means of a noise-reduction algorithm. However, the appropriate denoising algorithms and their parameter settings differ from one environment to another (such as a pure speech environment, a noisy speech environment, or a noise environment).
[0013] To address these problems, an ambient sound detection algorithm is introduced, so that the system can selectively start the noise-reduction algorithm and set the relevant parameters based on the decision result of the ambient sound detection algorithm. In early hearing aid and cochlear implant systems, the ambient sound detection classifier adopts a hidden Markov model: the model is relatively simple and theoretically mature, does not require large amounts of training data, achieves a reasonable recognition accuracy, involves relatively little computation, and can therefore be adapted to a cochlear implant with limited computational capacity. With continuous innovation in pattern recognition, machine learning and related fields in recent years, as well as continuous progress in computational algorithms, further classification algorithms (such as the support vector machine and the neural network) have become relevant in the field of ambient sound detection and offer higher classification accuracy. Additionally, compared with the hidden Markov model, support vector machine and neural network classifiers focus on distinguishing categories and do not need a prior probability of category transition; that is, it is only necessary to analyze the data of the different ambient sounds, without considering the probability of one ambient sound turning into another. Such transition probabilities are very difficult to obtain, and estimating them from data is not accurate enough. However, the neural network is highly variable: depending on the number of input feature values, the number of hidden layers and the number of nodes in each layer, its network structure can take many forms. Moreover, since a neural network's classification accuracy is generally proportional to its scale, the amount of computation required is relatively large.
[0015] Summary of the invention
[0017] To solve the above problem, and targeting the deficiencies of existing sound detection processing, the present invention provides an ambient sound detection method for a cochlear implant that uses a neural network to classify ambient sound, where the input feature values and the network structure of the neural network are optimized for a cochlear system; that is, the amount of computation is minimized while a given classification accuracy rate is maintained.
[0019] In order to achieve the above objective, the technical solution of the present invention provides an ambient sound detection method for a cochlear implant. The method includes the following stages:
[0021] acquiring ambient sound in real time by a sound acquisition module using a microphone, and sending a series of acquired discrete sound signals to a sound feature extraction module by the sound acquisition module;
[0023] processing the sound signal transmitted from the sound acquisition module by the sound feature extraction module, extracting a set of feature values representing attributes of the sound signal by the sound feature extraction module, and sending the feature values to a neural network classification module by the sound feature extraction module;
[0025] classifying the set of feature values by the neural network classification module through a trained neural network after the neural network classification module receives the set of feature values extracted by the sound feature extraction module, and sending a classification result to an integral decision module by the neural network classification module;
[0027] determining a current scene through a comprehensive analysis by the integral decision module after receiving the classification result from the neural network classification module, and sending a determination result to a speech processing selection module by the integral decision module; and
[0029] selecting an optimal speech processing program and a parameter setting thereof by the speech processing selection module based on the determination result of the integral decision module for the current scene.
[0031] Preferably, the microphone for acquiring ambient sound in real time is an omnidirectional microphone or a microphone array.
[0033] Preferably, a sampling rate of the sound acquisition module is 16 kHz.
[0035] Preferably, the number of values of the extracted feature set representing the attributes of the sound signal is 8.
[0037] Preferably, the neural network classification module adopts a deep neural network or a time-delay neural network containing two hidden layers with 15 neurons in each hidden layer.
[0039] Preferably, the 8 feature values are screened from 60 feature values.
[0041] Preferably, for the screening of the feature values, a comprehensive analysis of the statistical values of the feature values is adopted, together with a Gaussian mixture model, a mean impact value algorithm, a sequential forward selection algorithm and an evaluation of classifier training results.
[0043] Preferably, the computation required for the feature values and for the neural network does not exceed 20% of the operational capacity of a speech processor of the cochlear implant.
[0045] Based on the above objective, the present invention further provides an ambient sound detection system for a cochlear implant. The system includes a sound acquisition module, a sound feature extraction module, a neural network classification module, an integral decision module, and a speech processing selection module that are connected sequentially, wherein
[0047] the sound acquisition module is configured to acquire ambient sound in real time using a microphone and send a series of acquired discrete sound signals to the sound characteristic extraction module;
[0049] the sound feature extraction module is configured to process the sound signal transmitted from the sound acquisition module, extract a set of feature values representing attributes of the sound signal, and send the feature values to the neural network classification module;
[0051] the neural network classification module is configured to classify the set of feature values through a trained neural network after receiving the set of feature values extracted by the sound feature extraction module, and to send a classification result to the integral decision module;
[0053] the integral decision module is configured to determine a current scene through a comprehensive analysis after receiving the classification result from the neural network classification module, and to send a determination result to the speech processing selection module; and
[0054] the speech processing selection module is configured to select an optimal speech processing program and a parameter setting thereof based on the determination result of the integral decision module for the current scene.
[0056] Brief description of the drawings
[0058] Figure 1 is a flow chart of the steps of an ambient sound detection method for a cochlear implant in accordance with one embodiment of the present invention;
[0060] Figure 2 is a structural block diagram of an ambient sound detection system for a cochlear implant in accordance with one embodiment of the present invention;
[0062] Figure 3 is a specific schematic diagram of a neural network classification module of the ambient sound detection method and system for the cochlear implant in accordance with an embodiment of the present invention; and
[0064] Fig. 4 is a comparative diagram of computational quantities and precision rates of networks with different hidden layers and different number of neurons in the ambient sound detection method for the cochlear implant according to an embodiment of the present invention.
[0066] Detailed description
[0068] To make the objectives, technical solutions and advantages of the present invention clearer, the present invention will now be described in greater detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
[0074] Rather, the present invention covers any alternative, modification, equivalent method and solution made within the spirit and scope of the present invention as defined in the claims. Furthermore, in order to allow the public to better understand the present invention, some specific aspects are described in detail in the following detailed description. Those skilled in the art will be able to fully understand the present invention even without these detailed descriptions.
[0077] Referring to Fig. 1 which shows a flow chart of the steps of a technical solution of an ambient sound detection method for a cochlear implant according to an embodiment of the present invention, the method includes the following steps.
[0079] In S10, ambient sound is acquired in real time by a sound acquisition module using a microphone, and the sound acquisition module sends a series of acquired discrete sound signals to a sound feature extraction module.
[0081] In S20, the sound signal transmitted from the sound acquisition module is processed by the sound feature extraction module, a set of feature values representing attributes of the sound signal is extracted by the sound feature extraction module, and the feature values are sent to a neural network classification module by the sound feature extraction module.
[0083] In S30, the set of feature values is classified by the neural network classification module through a trained neural network after the neural network classification module receives the set of feature values extracted by the sound feature extraction module, and the classification result is sent to an integral decision module by the neural network classification module.
[0084] In S40, a current scene is determined through a comprehensive analysis by the integral decision module after receiving the classification result from the neural network classification module, and the determination result is sent to a speech processing selection module by the integral decision module.
[0086] At S50, an optimal speech processing program and a parameter setting thereof are selected by the speech processing selection module based on the determination result of the integral decision module for the current scene.
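The five-step flow S10–S50 can be sketched as a minimal processing pipeline. Everything below is an illustrative assumption rather than the patent's actual implementation: the eight features, the placeholder linear classifier (standing in for the trained neural network), the energy threshold, and the program/parameter table are all invented for the sketch.

```python
import numpy as np

def extract_features(frame):
    """Illustrative stand-in for the patent's 8 feature values (S20).

    The patent screens 8 features from 60 candidates but does not list
    them here, so simple temporal/spectral statistics are used instead."""
    spectrum = np.abs(np.fft.rfft(frame))
    return np.array([
        frame.mean(), frame.std(),
        np.abs(frame).max(),
        (np.diff(np.sign(frame)) != 0).mean(),   # zero-crossing rate
        spectrum.mean(), spectrum.std(),
        spectrum.argmax() / len(spectrum),        # dominant-frequency index
        (frame ** 2).mean(),                      # short-time energy
    ])

def classify(features, weights):
    """Placeholder for the trained network (S30): returns a class index 0..3."""
    W, b = weights
    return int(np.argmax(W @ features + b))

def decide(class_idx, short_time_energy, energy_floor=1e-6):
    """Integral decision (S40): combine classifier output with energy."""
    if short_time_energy < energy_floor:
        return "silence"
    return ["pure speech", "noisy speech", "noise", "music"][class_idx]

def select_program(scene):
    """Map the decided scene to a (program, parameters) pair (S50)."""
    table = {
        "pure speech":  ("CIS",   {"denoise": False}),
        "noisy speech": ("CIS",   {"denoise": True, "strength": 0.5}),
        "noise":        ("CIS",   {"denoise": True, "strength": 0.9}),
        "music":        ("music", {"denoise": False}),
        "silence":      ("idle",  {}),
    }
    return table[scene]

# End-to-end run on one frame of synthetic 16 kHz audio (S10 simulated).
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
feats = extract_features(frame)
weights = (rng.standard_normal((4, 8)), rng.standard_normal(4))
scene = decide(classify(feats, weights), (frame ** 2).mean())
program, params = select_program(scene)
```

The point of the sketch is the data flow between the five modules, not any particular feature set or classifier.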
[0088] An embodiment of the system of the present disclosure is shown in Figure 2. The system includes a sound acquisition module 10, a sound feature extraction module 20, a neural network classification module 30, an integral decision module 40, and a speech processing selection module 50 that are connected sequentially.
[0090] Sound acquisition module 10 is configured to acquire ambient sound in real time using a microphone and send a series of acquired discrete sound signals to sound feature extraction module 20.
[0092] The sound feature extraction module 20 is configured to process the sound signal transmitted from the sound acquisition module, extract a set of feature values representing the attributes of the sound signal, and send the feature values to the neural network classification module 30.
[0094] The neural network classification module 30 is configured to classify the set of feature values through a trained neural network after receiving the set of feature values extracted by the sound feature extraction module, and to send a classification result to the integral decision module 40.
[0096] The integral decision module 40 is configured to determine a current scene through a comprehensive analysis after receiving the classification result from the neural network classification module, and to send a determination result to the speech processing selection module 50.
[0098] The speech processing selection module 50 is configured to select an optimal speech processing program and a parameter setting thereof based on the determination result of the integral decision module for the current scene.
[0100] In a specific embodiment, the microphone for acquiring the real-time ambient sound in step S10 is an omnidirectional microphone or a microphone array, and the sampling frequency of the sound acquisition module 10 is 16 kHz.
[0102] The number of the set of extracted feature values representing the attributes of the sound signal in step S20 is 8. The 8 feature values are screened from 60 feature values. The feature values are normalized before use, and the normalization equation is as follows:
[0106] xnorm = (x - Xmin) / (Xmax - Xmin),
[0109] where xnorm is a normalized result, Xmax is a maximum value of a training sample of the characteristic values and Xmin is a minimum value of the training sample of the characteristic values.
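The normalization described above can be sketched as follows. The per-feature minima and maxima come from the training samples, as the text states; the clipping of out-of-range values is a practical choice added here, not something the patent specifies.

```python
import numpy as np

def fit_minmax(train_features):
    """Learn per-feature Xmin/Xmax from the training samples."""
    x = np.asarray(train_features, dtype=float)
    return x.min(axis=0), x.max(axis=0)

def normalize(features, x_min, x_max):
    """xnorm = (x - Xmin) / (Xmax - Xmin); values outside the training
    range are clipped to [0, 1] (an added practical safeguard)."""
    x_norm = (np.asarray(features, dtype=float) - x_min) / (x_max - x_min)
    return np.clip(x_norm, 0.0, 1.0)

# Toy training set with two features per sample.
train = [[0.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
lo, hi = fit_minmax(train)          # lo = [0, 10], hi = [4, 30]
out = normalize([2.0, 30.0], lo, hi)  # -> [0.5, 1.0]
```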
[0110] The neural network classification module in step S30 adopts a deep neural network or a time-delay neural network containing two hidden layers with 15 neurons in each hidden layer. The neural network module is trained on a large number of data samples. Taking as an example a case in which four types of ambient sound are determined (such as pure speech, noisy speech, noise, and music/silence), the neural network module is shown in figure 3. Feature values 1, 2, 3, 4, 5 and 6 are selected and together form a set. Training samples are drawn from a large number of acquired audio files and contain 144,000 sets of sample feature values, with each type of ambient sound contributing 36,000 sets of feature values. To find a balance between the amount of computation and the accuracy rate, and referring to Fig. 4, networks with one hidden layer and with two hidden layers, each with different numbers of neurons, are tested. The figure shows that the accuracy rate of the neural network with two hidden layers is clearly higher than that of the neural network with one hidden layer, and that the optimal number of neurons is 15.
[0112] A determination equation for the neural network in step S40 is as follows:
[0114] H1 = activeFcn(W1 × Xinput + B1)
[0115] H2 = activeFcn(W2 × H1 + B2)
[0116] Youtput = activeFcn(W3 × H2 + B3)
[0118] where Xinput is an array of input feature values; W1, W2 and W3 are the layer weight matrices of the trained neural network; B1, B2 and B3 are the layer bias matrices of the trained neural network; activeFcn is an activation function; and Youtput is the result of the network computation.
[0120] To reduce the amount of computation, the activation function activeFcn_H of the hidden layers and the activation function activeFcn_O of the output layer are defined as follows:
[0124] where x is the input of the activation function and i is a category of ambient sound.
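The layered network computation described above (8 inputs, two hidden layers of 15 neurons, 4 output categories) can be sketched as a forward pass. Note the assumptions: the patent's own definitions of activeFcn_H and activeFcn_O are not reproduced in this text, so tanh for the hidden layers and softmax for the output layer are stand-ins chosen here, and all weights are random rather than trained.

```python
import numpy as np

def active_fcn_h(x):
    # Hidden-layer activation; tanh is assumed here, since the patent's
    # definition of activeFcn_H does not survive in this text.
    return np.tanh(x)

def active_fcn_o(x):
    # Output-layer activation; softmax over the ambient-sound categories
    # is assumed, shifted by the max for numerical stability.
    e = np.exp(x - x.max())
    return e / e.sum()

def forward(x_input, W1, B1, W2, B2, W3, B3):
    """H1 = f(W1·x + B1); H2 = f(W2·H1 + B2); Youtput = g(W3·H2 + B3)."""
    h1 = active_fcn_h(W1 @ x_input + B1)
    h2 = active_fcn_h(W2 @ h1 + B2)
    return active_fcn_o(W3 @ h2 + B3)

rng = np.random.default_rng(1)
x = rng.standard_normal(8)                        # 8 feature values
W1, B1 = rng.standard_normal((15, 8)), np.zeros(15)
W2, B2 = rng.standard_normal((15, 15)), np.zeros(15)
W3, B3 = rng.standard_normal((4, 15)), np.zeros(4)
y = forward(x, W1, B1, W2, B2, W3, B3)            # 4 category scores
```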
[0126] After receiving the classification result from the neural network classification module, the integral decision module comprehensively analyzes a number of factors, chiefly the neural network recognition result and the short-time sound energy magnitude, determines the current scene, and sends the determination result to the speech processing selection module.
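One plausible reading of this comprehensive analysis is to gate on short-time energy and smooth the per-frame classifier output over a short window. The majority-vote rule, window length, and energy threshold below are illustrative assumptions; the patent does not specify them.

```python
from collections import Counter, deque

class IntegralDecision:
    """Combine per-frame class labels with short-time energy.

    Frames with negligible energy are declared silence; otherwise a
    majority vote over the recent window avoids rapid scene flips."""

    def __init__(self, window=3, energy_floor=1e-6):
        self.history = deque(maxlen=window)
        self.energy_floor = energy_floor

    def decide(self, class_label, short_time_energy):
        if short_time_energy < self.energy_floor:
            return "silence"
        self.history.append(class_label)
        # Most common label in the window wins.
        return Counter(self.history).most_common(1)[0][0]

dec = IntegralDecision(window=3)
labels = ["noise", "noisy speech", "noisy speech"]
result = [dec.decide(l, 1.0) for l in labels][-1]   # -> "noisy speech"
```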
[0128] The speech processing selection module selects an optimal speech processing program and a parameter setting thereof based on the result of the determination of the integral decision module for the current scene.
[0130] For the screening of the feature values, a comprehensive analysis of the statistical values of the feature values is adopted, together with a Gaussian mixture model, a mean impact value algorithm, a sequential forward selection algorithm and an evaluation of classifier training results.
[0132] The computation required for the feature values and for the neural network does not exceed 20% of the operational capacity of the cochlear implant's speech processor.
[0134] The above descriptions are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent substitution and improvements made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims:
Claims (9)
[1]
1. An ambient sound detection method for a cochlear implant, the method comprising the following steps:
acquiring ambient sound in real time by a sound acquisition module using a microphone, and sending a series of acquired discrete sound signals to a sound feature extraction module by the sound acquisition module;
process the sound signal transmitted from the sound acquisition module by the sound feature extraction module, extract a set of feature values representing attributes of the sound signal by the sound feature extraction module, and send the feature values to a neural network classification module by the sound feature extraction module;
classify the set of feature values by the neural network classification module through a trained neural network after the neural network classification module receives the set of feature values extracted by the sound feature extraction module, and send a classification result to an integral decision module by the neural network classification module;
determine a current scene through a comprehensive analysis by the integral decision module after receiving the classification result from the neural network classification module, and send a determination result to a speech processing selection module by the integral decision module; and
selecting an optimal speech processing program and a parameter setting thereof by the speech processing selection module based on the determination result of the integral decision module for the current scene.
[2]
The method according to claim 1, wherein the microphone for acquiring ambient sound in real time is an omnidirectional microphone or a microphone array.
[3]
The method according to claim 1, wherein a sampling frequency of the sound acquisition module is 16 kHz.
[4]
The method according to claim 1, wherein the number of values of the extracted feature set representing the attributes of the sound signal is 8.
[5]
The method according to claim 1, wherein the neural network classification module adopts a deep neural network or a time-delay neural network containing two hidden layers with 15 neurons in each hidden layer.
[6]
The method according to claim 4, wherein the 8 characteristic values are screened from 60 characteristic values.
[7]
The method according to claim 6, wherein, for the screening of the characteristic values, a comprehensive analysis of statistical values of the characteristic values is adopted, together with a Gaussian mixture model, a mean impact value algorithm, a sequential forward selection algorithm and an evaluation of classifier training results.
[8]
The method according to claim 1, wherein a calculation amount of the characteristic values and a calculation amount of the neural network do not exceed 20% of the operational capacity of a speech processor of the cochlear implant.
[9]
9. A system adopting the method according to any one of claims 1 to 8, the system comprising the sound acquisition module, the sound feature extraction module, the neural network classification module, the integral decision module and the speech processing selection module, which are connected sequentially, wherein
the sound acquisition module is configured to acquire the ambient sound in real time using the microphone, and send the series of acquired discrete sound signals to the sound characteristic extraction module;
the sound feature extraction module is configured to process the sound signal transmitted from the sound acquisition module, extract a set of feature values representing attributes of the sound signal, and send the feature values to the neural network classification module;
the neural network classification module is configured to classify the set of feature values through a trained neural network after receiving the set of feature values extracted by the sound feature extraction module, and to send a classification result to the integral decision module;
the integral decision module is configured to determine a current scene through a comprehensive analysis after receiving the classification result from the neural network classification module, and to send a determination result to the speech processing selection module; and
the speech processing selection module is configured to select an optimal speech processing program and a parameter setting thereof based on the determination result of the integral decision module for the current scene.
Similar technologies:
Publication number | Publication date | Patent title
ES2849124A1|2021-08-13|Artificial cochlea ambient sound sensing method and system
Wang2017|Deep learning reinvents the hearing aid
CN105611477B|2018-06-01|The voice enhancement algorithm that depth and range neutral net are combined in digital deaf-aid
CN106952649A|2017-07-14|Method for distinguishing speek person based on convolutional neural networks and spectrogram
CN107103901B|2019-12-24|Artificial cochlea sound scene recognition system and method
CN108305615A|2018-07-20|A kind of object identifying method and its equipment, storage medium, terminal
WO2019023879A1|2019-02-07|Cough sound recognition method and device, and storage medium
WO2018046595A1|2018-03-15|Classifier ensemble for detection of abnormal heart sounds
WO2017218492A1|2017-12-21|Neural decoding of attentional selection in multi-speaker environments
CN109121057A|2019-01-01|A kind of method and its system of intelligence hearing aid
Li et al.2012|Real-time speaker identification using the AEREAR2 event-based silicon cochlea
CN109448755A|2019-03-08|Artificial cochlea's auditory scene recognition methods|
WO2020087716A1|2020-05-07|Auditory scene recognition method for artificial cochlea
Sharan et al.2017|Cough sound analysis for diagnosing croup in pediatric patients using biologically inspired features
Estrebou et al.2010|Voice recognition based on probabilistic SOM
Liu et al.2010|The use of spike-based representations for hardware audition systems
CN112259107A|2021-01-22|Voiceprint recognition method under meeting scene small sample condition
Huang et al.2019|Audio-replay Attacks Spoofing Detection for Automatic Speaker Verification System
Kiselev et al.2021|Event-driven local gain control on a spiking cochlea sensor
Akimoto et al.2020|POCO: A Voice Spoofing and Liveness Detection Corpus Based on Pop Noise.
Uysal et al.2007|Spike-based feature extraction for noise robust speech recognition using phase synchrony coding
Zabidi et al.2011|Binary particle swarm optimization for feature selection in detection of infants with hypothyroidism
Whitehill et al.2020|Whosecough: In-the-wild cougher verification using multitask learning
Kothapally et al.2017|Speech Detection and Enhancement Using Single Microphone for Distant Speech Applications in Reverberant Environments.
Ghosh et al.2020|Portable Smart-Space Research Interface to Predetermine Environment Acoustics for Cochlear implant and Hearing aid users with CCi-MOBILE
Family patents:
Publication number | Publication date
CN108711419A|2018-10-26|
CN108711419B|2020-07-31|
WO2020024807A1|2020-02-06|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title
CN101529929A|2006-09-05|2009-09-09|GN ReSound A/S|A hearing aid with histogram based sound environment classification|
CN107103901A|2017-04-03|2017-08-29|Zhejiang Nurotron Biotechnology Co., Ltd.|Artificial cochlea's sound scenery identifying system and method|
AU2003281984B2|2003-11-24|2009-05-14|Widex A/S|Hearing aid and a method of noise reduction|
CN103456301B|2012-05-28|2019-02-12|ZTE Corporation|A kind of scene recognition method and device and mobile terminal based on ambient sound|
CN105845127B|2015-01-13|2019-10-01|Alibaba Group Holding Limited|Audio recognition method and its system|
CN105611477B|2015-12-27|2018-06-01|Beijing University of Technology|The voice enhancement algorithm that depth and range neutral net are combined in digital deaf-aid|
CN108172238B|2018-01-06|2021-08-13|Guangzhou Yinshu Technology Co., Ltd.|Speech enhancement algorithm based on multiple convolutional neural networks in speech recognition system|
CN108231067A|2018-01-13|2018-06-29|Fuzhou University|Sound scenery recognition methods based on convolutional neural networks and random forest classification|
CN108711419B|2018-07-31|2020-07-31|Zhejiang Nurotron Biotechnology Co., Ltd.|Environmental sound sensing method and system for cochlear implant|
CN108711419B|2018-07-31|2020-07-31|Zhejiang Nurotron Biotechnology Co., Ltd.|Environmental sound sensing method and system for cochlear implant|
CN109448703B|2018-11-14|2021-05-11|Shandong Normal University|Audio scene recognition method and system combining deep neural network and topic model|
CN111491245B|2020-03-13|2022-03-04|Tianjin University|Digital hearing aid sound field identification algorithm based on cyclic neural network and implementation method|
Legal status:
2021-08-13| BA2A| Patent application published|Ref document number: 2849124 Country of ref document: ES Kind code of ref document: A1 Effective date: 20210813 |
Priority:
Application number | Filing date | Patent title
CN201810856692.8A|CN108711419B|2018-07-31|2018-07-31|Environmental sound sensing method and system for cochlear implant|
PCT/CN2019/096648|WO2020024807A1|2018-07-31|2019-07-19|Artificial cochlea ambient sound sensing method and system|